Below are some thoughts on extending the dip-meter project.  They
are techniques which will remedy (or, if they are used early enough,
avoid) many of the problems inherent in building a very large
rule-based expert program.  These are reactions to my discussions
with Bud Frawley and Howard Austin, on Sept. 9, 1980. 

			---  Doug Lenat


Dealing with a very large number of rules

One of the problems that this project will face, as the number of
rules reaches ever higher, is the increasing cost of running all the
rules all the time  (that is, evaluating ALL their "IF" parts to see
which ones are relevant and should have their "THEN" parts obeyed).
In any given situation, only a small subset of the rules will be
even POTENTIALLY relevant; most of them would be ridiculously 
nonsensical because they check for situations having NOTHING to do
with the current one.  Below are four techniques for reducing
the size of the set of rules which have to have their "IF" parts
tested -- all the rest can be safely assumed to be irrelevant.
One good thing about these techniques is that they are independent:
any or all of them can be employed.

Imagine all the dipmeter concepts arranged in a big tree, a hierarchy,
linked by arcs labelled "more general than" and "more specialized than".
At the top would be the few most general concepts, like "log", and at
the bottom would be the most specific ones, like "multiple small red
patterns within a blue".  Attach each rule to the concept to which it's
most relevant; try to make it the MOST GENERAL concept to which it's
relevant.  Notice that the rule is also relevant to the specializations
of that concept  (e.g., any rule that tells you about ALL log patterns
-- such as that one wild tadpole can be ignored -- certainly also
is relevant to a specialization of that concept, such as the "blue within
red log pattern" concept). Now, given the recognition of a very specific
pattern, all the program needs to do is run the rules associated with
that specific concept, and its generalizations;  the other rules can
be presumed to be irrelevant.  Thus, you are only dealing with LOG(n)
rules, instead of the full set of n rules.
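
As a concrete sketch of this first technique (purely illustrative: the
RULES and GENERALIZATIONS property names are invented here, and SETF of
GET stands in for whatever property-list machinery the system uses),
the rules worth testing might be gathered like this:

    (defun relevant-rules (concept)
      ;; Sketch only -- RULES and GENERALIZATIONS are invented names.
      ;; Gather the rules attached to CONCEPT, then ripple up the
      ;; generalization links and gather the rules of each ancestor too;
      ;; every rule attached elsewhere is presumed irrelevant.
      (when concept
        (remove-duplicates
         (append (get concept 'rules)
                 (loop for g in (get concept 'generalizations)
                       append (relevant-rules g))))))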

A second technique can be used to cut a simple CONSTANT factor off the
number of rules that have to be considered:  divide up the task into
phases, or subtasks, such as Validity-check, user-input, missing-sections,
stratigraphy, etc.  Decide for each rule which of these subtasks it is
relevant to, and let each subtask know the names of the rules relevant
to it.  Then it need only look at those  n/8 rules, rather than n  (assuming
you've split the task into 8 subtasks).

A third technique can be used.  Like the second one, this also cuts
only a constant factor off the number of rules that need to be
considered as potentially relevant.   Pick a few aspects that all the
subtasks have in common, such as Initialization, Checking, Finding,
Reporting, etc., and classify each rule in that way, too.
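
A minimal sketch of how the second and third techniques might be
combined (the SUBTASKS and ASPECTS property names, and the particular
classifications of rule 13, are invented for illustration): each rule's
property list records the subtasks and aspects it applies to, and a
phase asks only for the rules matching both.

    (defun rules-for (subtask aspect all-rules)
      ;; Sketch only (invented property names): keep just the rules
      ;; filed under this SUBTASK (technique 2) and under the ASPECT of
      ;; it currently being performed (technique 3).
      (remove-if-not
       (lambda (rule)
         (and (member subtask (get rule 'subtasks))
              (member aspect  (get rule 'aspects))))
       all-rules))

    ;; Each rule is classified once, when it is written, e.g.:
    (setf (get 'rule13 'subtasks) '(missing-sections stratigraphy))
    (setf (get 'rule13 'aspects)  '(checking finding))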

The fourth technique is to split the IF part of a rule into two parts:
an IF-potentially-relevant part and an IF-truly-relevant part.
The former is always filled with a VERY FAST test, whereas the latter
contains a much slower but more complete predicate.  To use this,
the program first runs the IF-potentially-relevant parts of all the
rules, and only runs the more costly IF part on those that survive
(return True for) the quick pre-condition check.  This doesn't
reduce the NUMBER of rules which must be run, but can dramatically
speed up the time involved in examining them all.  On a small scale,
the same effect is achieved by carefully ordering the conjuncts of
an AND expression in Lisp, so that the first ones are very fast.
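
A sketch of the fourth technique, assuming (again, only for
illustration) that each rule's property list holds the two predicates
under IF-POTENTIALLY-RELEVANT and IF-TRULY-RELEVANT:

    (defun rules-to-fire (rules situation)
      ;; Sketch only: run the cheap pre-condition of every candidate
      ;; rule first, and run the slow, complete IF test only on the
      ;; rules that survive it.
      (remove-if-not
       (lambda (rule)
         (funcall (get rule 'if-truly-relevant) situation))
       (remove-if-not
        (lambda (rule)
          (funcall (get rule 'if-potentially-relevant) situation))
        rules)))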

Suppose the data base contains 500 concepts, and a total of 5,000 rules.
Employing technique 1 alone, the number of rules found by "rippling up"
the generalization links is only about 100.  Techniques 2 and 3 reduce
this to about 4.  Technique 4 makes it very fast to tell which of
these four remaining rules have some chance of firing.  Of the, say, 2
which remain, one or both of them will actually be relevant.  Note how
the techniques essentially eliminate the problem of "too many rules".
Another benefit comes when it's necessary to modify the set of rules --
by adding a new one, modifying an existing rule, etc. -- and it's necessary
to understand the impact of that change on related rules.  The organization
and classification of rules means that there will be very few rules
which the new/deleted/changed rule could POSSIBLY interact with, and all of
these can be found automatically very quickly.  Also, by following the advice
of suggestion 1, each rule will be at its "most general possible level",
hence there is as good a chance as possible that it will be "above" any
change you later make to the data base, i.e., that it won't have to be
changed in any way.

The way the rules are currently written, they each know a LOT about
each other.  To make the rule system truly additive, to make the
augmentation of the system by a new rule an easy event, it is
necessary to eliminate as much of this "innate knowledge" as possible.
One of the chief culprits currently is the long idiosyncratic
list of parameters each rule gets and passes along, the attribute
called VARIABLES on the property list of each rule.  That would
be better off being replaced by VARIABLE-TYPE.  For instance,
rule 13 currently has that attribute filled with the list
(type pattern-top pattern-bot azimuth trend-top trend-bot).  If
instead it simply had an attribute called VARIABLE-TYPE, we could
have it filled with the atom LogPatternSegment.  The latter would
have a property list, among whose attributes would be VARIABLES.
Then, when the format of that long list of variables changed,
only one place would have to get changed, not hundreds.   This
is the kind of change which should be effected as soon as possible.
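
A sketch of the proposed change, using the rule-13 example above (SETF
of GET stands in for whatever property-list machinery is actually used,
and the RULE-VARIABLES helper is invented here):

    ;; Today: every rule carries its own copy of the long list, e.g.
    ;;   (setf (get 'rule13 'variables)
    ;;         '(type pattern-top pattern-bot azimuth trend-top trend-bot))

    ;; Proposed: a rule names only the type of thing it works on, and
    ;; that type's property list owns the one shared copy of the list.
    (setf (get 'rule13 'variable-type) 'LogPatternSegment)
    (setf (get 'LogPatternSegment 'variables)
          '(type pattern-top pattern-bot azimuth trend-top trend-bot))

    (defun rule-variables (rule)
      ;; Fetch a rule's variables indirectly, so a change to the format
      ;; of the list is made in one place rather than in hundreds.
      (get (get rule 'variable-type) 'variables))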


What are the rules based on?

Each rule should carry some record of what it is based on.  Several
simple ways of recording this are possible:  store a textual
response, store a list of machine-usable reasons, store a single
numeric certainty factor, classify each rule as either Definite
or merely Probable, etc.  The advantage of the "list of machine-
readable reasons" is that from it can be generated each of the
other types of epistemological tags mentioned.  Also, you may
at times need to compare the types of justifications for various
rules, e.g. if some higher-order rule (metarule) told you
"If two rules conflict, and one is based on empirical evidence
and the other is logically true, Then prefer the latter".  The
program can always cache away (store) the values of the certainty
factors (CFs) it computes from these more descriptive reasons.
These CFs can then be used when a quick and dirty estimate of the
overall validity of the rule is required.
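
A sketch of what the machine-usable reasons and the cached CF might
look like; the BASIS and CACHED-CF property names, the reason
categories, and the particular numbers are all invented for
illustration:

    (setf (get 'rule13 'basis)
          '((empirical "seen in most training logs")
            (heuristic "rule of thumb reported by analysts")))

    (defun certainty-factor (rule)
      ;; Sketch only: compute a quick-and-dirty CF from the descriptive
      ;; reasons the first time it is needed, then cache it.
      (or (get rule 'cached-cf)
          (setf (get rule 'cached-cf)
                (cond ((assoc 'logical   (get rule 'basis)) 1.0)
                      ((assoc 'empirical (get rule 'basis)) 0.8)
                      (t 0.5)))))

    (defun prefer (rule1 rule2)
      ;; The metarule "when two rules conflict, prefer the logically
      ;; true one over the merely empirical one" reads the reasons
      ;; directly rather than the cached numbers.
      (if (and (assoc 'logical (get rule2 'basis))
               (not (assoc 'logical (get rule1 'basis))))
          rule2
          rule1))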

Other attributes of rules

Rules can be related to each other in many ways, and this could come
in handy; e.g., if some metarule said "When a rule fails, try a
different rule which has roughly the same purpose but which is
known to succeed more frequently".  To do this easily, it would
be nice if each rule had a "Percentage of success" attribute.
Some other useful attributes (slots, properties on its property
list, etc.) would be:  average-cpu-time, average-number-of-questions-
asked-of-the-user,  names-of-more-general-rules, etc.
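
A sketch of that metarule in action, assuming (for illustration only)
slots named PERCENTAGE-OF-SUCCESS and RULES-WITH-SAME-PURPOSE on each
rule's property list:

    (defun best-alternative (failed-rule)
      ;; Sketch only: when a rule fails, pick the rule with roughly the
      ;; same purpose that is known to succeed most frequently.
      (let ((candidates (get failed-rule 'rules-with-same-purpose)))
        (when candidates
          (first (sort (copy-list candidates) #'>
                       :key (lambda (r)
                              (or (get r 'percentage-of-success) 0)))))))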


How should the overall log interpretation process be managed?

Currently, the problem is broken into 7 phases (valid-check,
user-input, valid-check-II, user-input-II, missing-sections,
pattern recognition, stratigraphy), which are rigidly run one
after the other, and then the program halts.  Instead, it might
be more natural to have an AGENDA containing a few "tasks",
such as Check-validity, Missing-Sections, Stratigraphy, and
each of these would have some symbolic reasons (and a cached
number representing its overall worth).  The program chooses the
task with the highest overall worth, and works on it for a while.
It is then suspended and placed back on the agenda, and a new task
will probably have a higher priority rating and be run next.
During the running of a task, it may suggest new tasks for the
agenda.  This kind of control structure allows the program to
switch back and forth between the tasks of guessing the structural
dip and the possible faults, letting the latest results from
one influence a new round of hypothesizing about the other.
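
A sketch of such an agenda loop (the TASK structure, the :SUSPENDED
convention, and the effort quantum are illustrative choices, not a
commitment):

    (defstruct task name reasons worth run-fn)

    (defvar *agenda* '())     ; tasks waiting to be worked on

    (defun add-task (task)
      (push task *agenda*))

    (defun run-agenda (quantum)
      ;; Sketch only: repeatedly pick the task with the highest cached
      ;; worth, work on it for a while, and put it back if it suspended
      ;; itself.  A task's RUN-FN may call ADD-TASK to post new tasks.
      (loop while *agenda*
            do (setf *agenda* (sort *agenda* #'> :key #'task-worth))
               (let ((task (pop *agenda*)))
                 (when (eq (funcall (task-run-fn task) quantum) :suspended)
                   (add-task task)))))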


How can the program "really understand"?

One step along the way is to make as much knowledge as possible EXPLICIT.
Thus, the experts know that sections of the log can repeat, and they
know what that means geologically.  The program should have that kind
of information explicitly represented within it.  Thus it might have
the horseshoe warping "script", which accounts for a repeated reversed
pattern; it might have the "crack and slide" script to account for
a repeated nonreversed segment; etc.  Geological knowledge per se may
seem superfluous to the functioning of the program, but it will in
the long run be a necessity, if the program is not to be
continually reworked in excruciating detail by humans each time
the task it's to perform changes a little.  General knowledge,
such as that of causality, physics, etc., is also required.
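
A sketch of making one piece of that knowledge explicit; the property
names and the *SCRIPTS* list are invented, and only the two
correspondences mentioned above are encoded:

    (defvar *scripts* '(horseshoe-warping crack-and-slide))

    (setf (get 'horseshoe-warping 'accounts-for) 'repeated-reversed-pattern)
    (setf (get 'crack-and-slide   'accounts-for) 'repeated-nonreversed-segment)

    (defun scripts-accounting-for (observation)
      ;; Sketch only: which explicitly represented geological "scripts"
      ;; could explain the kind of repetition observed in the log?
      (remove-if-not (lambda (s) (eq (get s 'accounts-for) observation))
                     *scripts*))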

Another example of troubles you get into when you know more than
is explicitly represented in your program is the following:
You have a rule whose IF part has three conjuncts; how do you
distinguish the conjuncts you want the system merely to check quickly
are already known to be true, from those which you want the system to
spend time WORKING to establish (i.e., by recursing), from those which
you want the system to ACHIEVE (i.e., make true if possible)?  The
answer is to replace the vague IF part by several specialized parts:
an IF-known-to-be-true, an IF-you-can-show, and an IF-you-can-achieve
part.  A similar solution is to specify, for each conjunct in the IF
part of each rule, how many resources of each type may be brought to
bear in testing it (how much cpu time, real time, list cells, queries
to the user, etc.).
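
A sketch of the proposed split, again using rule 13 and invented
conjunct names; each conjunct of the slower kinds carries its own
resource allowance:

    ;; Sketch only -- all conjunct names and allowances are invented.
    (setf (get 'rule13 'if-known-to-be-true)
          '(segment-is-a-log-pattern))                 ; just look it up
    (setf (get 'rule13 'if-you-can-show)
          '((trend-is-consistent :cpu-seconds 5)))     ; work (recurse) on it
    (setf (get 'rule13 'if-you-can-achieve)
          '((azimuth-is-known    :user-queries 1)))    ; e.g. ask the user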